I.  INTRODUCTION


A.  BACKGROUND 

Military  convoys,  also  referred  to  as  Combat  Logistic  Patrols  (CLPs),  have 
always  been  lucrative  targets  for  adversaries  that  do  not  have  the  military  advantage  to 
sustain  and  win  in  direct  conflict.  The  safety  of  a  convoy  is  often  seen  as  a  function  of 
active  measures  employed  by  the  convoy  such  as  speed,  increased  aggressiveness  and 
patrolling. To this end, Matthew Hakola used agent-based modeling and principal
component analysis to investigate which variables carry more weight in the success of the
convoy [1]. However, when faced with multiple routes, a commander must make
decisions based on the threat associated with each route before choosing which one to
take. William Ruckle’s geometric approach to the game theory behind a hunter and prey
has been useful in understanding the foundation of our game [2].

Our convoy, which we will name Blue, wishes to traverse an area but faces the
threat of being intercepted between the start and finish of his route by an enemy lying in
wait. We will call the enemy Red. Using Ruckle’s example, we will assume our area is a
unit square, with a horizontal x-axis and vertical y-axis. We will assume that neither Red
nor Blue receives any intelligence on the position of the other once Blue starts to move.
Because of this limitation, Red gains no advantage from changing locations while Blue is
in motion, and so we will consider Red’s position fixed. Blue simply wishes to get from
x = 0 to x = 1. With this in mind we can disregard Red’s x-coordinate, as it does not affect
his ability to intercept Blue. Instead, we focus on Red’s ability to intercept Blue as a
function of his y-coordinate and the range of his weapon, which we will define as r.
Therefore,  our  unit  box  can  now  be  decomposed  into  routes  of  Mr  width.  Blue  gains  no 
advantage  by  changing  routes  while  traversing  through  the  area  but  can  change  routes 
with  each  new  crossing.  Using  this  logic,  we  limit  our  game  to  the  use  of  individual 
routes  versus  a  network  of  routes. 

The limitation with most ambush models is that the threat is always one sided.
This  is  evident  in  Ruckle’s  examples  [2],  as  well  as  in  textbook  examples  on  game  theory 
[6].  In  reality,  an  ambusher  must  also  contend  with  the  fact  that  he  may  be  discovered 
before  he  gets  the  opportunity  to  conduct  an  ambush.  Furthermore,  the  threat  is  not 
always  from  an  active  searcher  (also  referred  to  as  direct  detection).  Rewards  and 
humanitarian  assistance  can  influence  an  area  to  be  more  pro-active  in  uncovering  and 
turning  in  cells  that  are  planning  ambushes  (also  referred  to  as  indirect  detection).  So,  the 
game  becomes  more  complex  with  each  side  trying  to  maximize  their  own  survival  time 
while  minimizing  their  opponent’s  time  considering  both  indirect  and  direct  detection. 

We  approach  this  model  using  a  “deductive”  search  methodology  following  work 
done  by  Owen  and  McCormick  [3].  This  search  methodology  focuses  on  determining 
those  routes  most  favorable  to  each  player  rather  than  trying  to  follow  any  “trail”  left  by 
their  presence.  Their  algorithm  provides  us  with  an  expected  time  to  capture  /  ambush 
for  both  players.  This  time  is  dependent  on  the  probability  of  both  direct  and  indirect 
detection  as  well  as  the  ability  to  successfully  change  routes.  Then,  given  the  expected 
time  to  capture  /  ambush  for  each  player  along  each  route,  we  use  non-linear 
programming to determine the optimal strategy for each player to adopt in order to
maximize  individual  survival  time. 

B.  RESEARCH  GOAL 

The goal of this paper is to provide a way to investigate and determine the mixed
strategies  used  by  both  the  convoys  and  adversaries  along  a  relatively  small  number  of 
routes.  This  will  allow  commanders  to  make  informed  decisions  on  which  routes  to  take 
to  maximize  the  survival  rate  of  CLPs  as  well  as  which  routes  are  most  likely  to  be 
ambush  sites. 

C.  BASIC  OVERALL  MODEL 

The approach we will use is that of a bimatrix game with the goal of finding
equilibrium points. Our players are defined as Blue, for the convoy, and Red, for the
ambusher. Each player has his own payoff matrix where his goal is to maximize his
survival time while his adversary simultaneously tries to minimize it. We define $\Psi$ as
the payoff matrix for the Search Model, where Red wishes to maximize his survival time
while Blue searches for him. Our other payoff matrix, A, represents the Ambush Model,
where Blue wishes to maximize his survival time while Red attempts to ambush him.
These two payoff matrices are then combined into a bimatrix model $(\Psi, A)$ that
represents the competing goals for each player (i.e., maximize individual survival time
while minimizing the opponent’s).

The  Search  and  Ambush  payoff  matrices  are  constructed  by  adopting  a  model 
developed by Owen and McCormick [3]. Their manhunting model considers a fugitive
who  faces  not  only  the  threat  of  being  captured  directly  but  also  the  threat  of  being  turned 
in  by  the  local  populace  [3].  The  Search  Model  adapts  this  directly  for  Red’s  threat  of 
being  found  through  Blue’s  direct  search  and  Blue’s  efforts  to  uncover  him  indirectly 
(i.e.,  through  the  actions  of  others).  The  Ambush  Model  takes  into  account  the  threat 
Blue  faces  from  a  direct  ambush  by  Red  as  well  as  the  indirect  hazards  that  may  prevent  a 
convoy  from  being  completed  (terrain,  length  of  route,  etc.).  Each  route  presents  four 
initial  probabilities: 

1)  The probability Red is detected indirectly by a third party (indirect detection, $q_i$)

2)  The probability Red is detected directly by Blue (direct detection, $p_i$)

3)  The probability Blue fails to complete the convoy because of reasons other than
an ambush (indirect hazard, $r_i$)

4)  The probability Blue fails to complete the convoy because he was ambushed
by Red (direct hazard, $s_i$)

The rate by which each initial probability increases is dependent upon the
intensity of the efforts of the adversary. The intensity of each player’s efforts to
minimize his opponent’s survival time is represented by the variables $\alpha$, $\beta$, and
$\gamma$. The amount of money and resources Blue spends on gaining the assistance of the local
populace to uncover Red affects how quickly the threat of indirect detection increases.
This rate of increase is represented by $\alpha$ and can be seen as quantifying the
aggressiveness of Blue’s efforts to indirectly detect Red. The intensity of a direct search
for Red (possibly as a function of the numbers of Soldiers, sensors, etc. involved)
determines how quickly the threat of direct detection increases along a certain route and
is represented by $\beta$, or the aggressiveness of Blue’s efforts to directly uncover Red. We
assume that the quality of the route (which affects the indirect threat to the convoy) will
not change with repeated iterations along that route, and therefore we limit ourselves to
only one parameter for the threat’s rate of increase in the Ambush Model. With every use of
a route, Blue faces the risk of Red moving onto that route to intercept him on the next
convoy. The rate at which this threat increases is defined as $\gamma$, Red’s aggressiveness.

Once we construct the individual payoff matrices, we construct the bimatrix
model $(\Psi, A)$ where each cell contains the pair of values from the respective individual
payoff matrices. Using non-linear programming, we determine the optimal route
selection strategies for each player. We show that these optimal strategies (also known
as Nash Equilibria) are dependent upon the desired survival time for each player, which
can be viewed as each player’s decision to be risk averse or risk prone. These optimal
strategies are then used to determine which routes a convoy should take, as well as on which
routes a patrol is most likely to uncover the enemy.
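
As a minimal sketch of the bimatrix structure described above, the following Python snippet pairs two payoff matrices cell by cell. The 2x2 size, the numeric entries, the row/column orientation, and the names `psi`, `ambush`, and `make_bimatrix` are illustrative assumptions, not values or conventions taken from the models.

```python
# Hypothetical 2-route example (assumed orientation: rows = Red's route,
# columns = Blue's route).
# psi[i][j]   : Red's expected survival time (Search Model payoff).
# ambush[i][j]: Blue's expected survival time (Ambush Model payoff).
psi = [[4.0, 9.0],
       [8.0, 3.0]]
ambush = [[2.0, 7.0],
          [6.0, 1.5]]


def make_bimatrix(red_payoffs, blue_payoffs):
    """Pair the two payoff matrices so each cell holds (Red payoff, Blue payoff)."""
    if len(red_payoffs) != len(blue_payoffs):
        raise ValueError("payoff matrices must have the same number of rows")
    return [list(zip(red_row, blue_row))
            for red_row, blue_row in zip(red_payoffs, blue_payoffs)]


bimatrix = make_bimatrix(psi, ambush)
for row in bimatrix:
    print(row)
```

Each cell is a tuple, so a solver can read both players' payoffs for a given pair of route choices from one lookup.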

II.  SEARCH  MODEL


A.  OVERVIEW 

We use a model created by Owen and McCormick [3] to determine the expected
time  until  capture  (or  ambush)  for  each  player.  For  the  benefit  of  the  reader,  and  to  add 
our  own  notation  that  will  be  useful  later  in  analysis,  we  will  cover  it  again  in  Chapters  II 
and  III  of  this  thesis.  For  our  purposes,  we  define  “indirect  detection”  as  the  threat  Red 
faces  of  being  discovered  indirectly  and  “direct  detection”  as  the  threat  he  faces  from 
being  found  directly  by  Blue. 

1.  Indirect  Detection 

The  local  populace  of  a  region  can  pose  a  risk  to  Red  and  shorten  his  survival 
time.  Blue  can  take  advantage  of  this  in  a  number  of  ways.  He  could  offer  greater 
rewards  for  revealing  Red  or  he  could  gain  the  confidence  of  the  local  population  so  that 
it  is  in  their  best  interest  to  betray  Red.  We  can  safely  assume  that  as  Red  occupies  an 
area  the  risk  of  discovery  increases.  Assuming  that  Blue  is  looking  on  another  route  for 
Red’s probability of detection and capture within t units of time on route i will be
represented by $Q_i(t)$, and the probability he has not already been captured is $1 - Q_i(t)$.

This probability of capture is not static but rather increases with time. We let $g_i(t)$
represent the rate at which this probability increases along route i while Red is on it.
Making the additional assumption that Red’s risk is initially zero when he enters the route,
we get the following differential equation and its solution.

$$Q_i'(t) = g_i(t)\,\bigl(1 - Q_i(t)\bigr), \qquad Q_i(0) = 0, \qquad Q_i(t) = 1 - \exp\{-G_i(t)\} \tag{1}$$


Note that the derivative of $Q_i(t)$ is the increase in the probability of detection. It
is the probability that Red has not been detected multiplied by the rate of increase for
that route. We further define $g_i$ as a linear, strictly increasing, unbounded function with
the initial probability, $q_i$, that Red will be captured on that route plus some linear rate of
increase, $\alpha_i > 0$, that controls how quickly that probability increases with time. Owen
and McCormick used a constant $\alpha = 0.01$ in their examples as a reasonable rate of
increase. In the case of our game, we can assign a value slightly larger if Blue is very
aggressively pursuing indirect means of detection along that route. Clearly, the value can
change for each route based on Blue’s efforts. $G_i$ is the antiderivative of $g_i$ evaluated
from zero to time t.

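This gives us

$$g_i(t) = q_i + \alpha_i t, \qquad G_i(t) = \int_0^t g_i(s)\,ds = t q_i + \frac{\alpha_i t^2}{2} \tag{2}$$

Applying (2) to the solution in (1) we obtain

$$Q_i(t) = 1 - \exp\{-G_i(t)\}, \qquad Q_i'(t) = g_i(t)\,\bigl(1 - Q_i(t)\bigr) = g_i(t)\exp\{-G_i(t)\} \tag{3}$$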
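
The relationship between the linear rate, its antiderivative, and the detection probability can be checked numerically. The sketch below is only an illustration: the parameter values and the function names `g`, `G`, and `Q` are assumptions chosen for the example. It compares a finite-difference estimate of the derivative of the detection probability with the right-hand side of the differential equation.

```python
import math


def g(t, q, alpha):
    """Rate of increase g_i(t) = q_i + alpha_i * t."""
    return q + alpha * t


def G(t, q, alpha):
    """Antiderivative G_i(t) = q_i * t + alpha_i * t**2 / 2."""
    return q * t + 0.5 * alpha * t * t


def Q(t, q, alpha):
    """Probability of indirect detection by time t: 1 - exp(-G_i(t))."""
    return 1.0 - math.exp(-G(t, q, alpha))


# Illustrative (assumed) parameters: q_i = 0.05 and alpha_i = 0.01.
q, alpha, t, h = 0.05, 0.01, 3.0, 1e-5

# A central finite difference approximates Q_i'(t) ...
lhs = (Q(t + h, q, alpha) - Q(t - h, q, alpha)) / (2.0 * h)
# ... and should match g_i(t) * (1 - Q_i(t)).
rhs = g(t, q, alpha) * (1.0 - Q(t, q, alpha))
print(lhs, rhs)
```

The two printed numbers should agree to many decimal places, and `Q(0, q, alpha)` is zero, matching the assumption that Red's risk is initially zero.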

Red wants to maximize his survival time, so we can assume that at some time T
he will decide to move, if he has not already been captured. Keep in mind that $Q_i(T)$ is
the probability Red is captured by time T, and $1 - Q_i(T)$ is the probability that Red will be
able to move at time T (i.e., he is not captured). We will assign the random variable $X_i$ to
represent the time Red spends on route i before moving. In determining this time, we
must take into account not only the probability he will be able to move at that time, but
also the time at which he may be captured before T, weighted by the density $Q_i'(t)$ from 0
to T. This gives us the formula for our expected time Red spends on route i.

$$E[X_i] = \int_0^T t\,Q_i'(t)\,dt + \bigl(1 - Q_i(T)\bigr)\,T \tag{4}$$

We can simplify this even further using (3)

$$E[X_i] = \int_0^T t\,g_i(t)\exp\{-G_i(t)\}\,dt + T\exp\{-G_i(T)\} \tag{5}$$

Red’s move does not come without risk. He can always be detected en route to
his next ambush location. Independent of when he moves, he expects to survive an
additional $V_i$ units of time after starting the move. Given that the probability that he will
even get to move is $\exp\{-G_i(T)\}$, we can define his total expected survival time on route
i as

$$A_i = \int_0^T t\,g_i(t)\exp\{-G_i(t)\}\,dt + T\exp\{-G_i(T)\} + \exp\{-G_i(T)\}\,V_i \tag{6}$$


Clearly Red chooses the time T of his departure so as to maximize his expected
survival time $A_i$. To maximize this we differentiate $A_i$ with respect to T.

$$\frac{dA_i}{dT} = T g_i(T)\exp\{-G_i(T)\} + \exp\{-G_i(T)\} - T g_i(T)\exp\{-G_i(T)\} - g_i(T)\exp\{-G_i(T)\}\,V_i$$

When simplified we get

$$\frac{dA_i}{dT} = \exp\{-G_i(T)\} - g_i(T)\exp\{-G_i(T)\}\,V_i \tag{7}$$

Setting this to zero and solving for T produces the following

$$T = g_i^{-1}\!\left(\frac{1}{V_i}\right) \tag{8}$$


Since $g_i$ is a strictly increasing function, we can be assured that T is unique. Now
we notice that we can simplify equation (6) if we integrate by parts letting $u = t$ and
$dv = g_i(t)\exp\{-G_i(t)\}\,dt$.

Note: $\int_0^T t\,g_i(t)\exp\{-G_i(t)\}\,dt = -T\exp\{-G_i(T)\} + \int_0^T \exp\{-G_i(t)\}\,dt$

So Red’s expected time to indirect detection along route i becomes

$$A_i = \int_0^T \exp\{-G_i(t)\}\,dt + \exp\{-G_i(T)\}\,V_i \tag{9}$$
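
The integration by parts can be sanity-checked numerically. The following sketch evaluates the three-term form of the expected survival time and the simplified form with composite Simpson quadrature; the parameter values, the departure time T = 5, and the helper name `simpson` are assumptions made only for this illustration.

```python
import math


def G(t, q, alpha):
    """G_i(t) = q_i * t + alpha_i * t**2 / 2."""
    return q * t + 0.5 * alpha * t * t


def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


q, alpha, T, V = 0.05, 0.01, 5.0, 10.0   # assumed illustrative values
survive = math.exp(-G(T, q, alpha))      # probability Red is still free at T

# Three-term form: capture-time integral + T if never captured + post-move time.
a_long = (simpson(lambda t: t * (q + alpha * t) * math.exp(-G(t, q, alpha)), 0.0, T)
          + T * survive + survive * V)
# Simplified form obtained after integrating by parts.
a_short = simpson(lambda t: math.exp(-G(t, q, alpha)), 0.0, T) + survive * V
print(a_long, a_short)
```

Both forms should print the same value up to quadrature error, which supports using the shorter expression in later steps.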

Keeping in mind how we defined $g_i$ we can take equation (8) and rewrite it as

$$q_i + \alpha_i T_i = \frac{1}{V_i}$$

Solving for $T_i$ we get

$$T_i = \frac{1 - q_i V_i}{\alpha_i V_i} \tag{10}$$

We can see that $T_i$ has the potential of being a negative number or zero. Since we
assume that $\alpha_i, V_i$ are positive, this can only occur when the initial probability $q_i$ is
sufficiently large so that $q_i V_i \ge 1$. Such a route would present a significant risk to Red, and
offer no gain in his expected survival time. Red would immediately move if he found
himself on such a route. We will therefore choose $T_i$ by (10) if it is positive and set
$T_i = 0$ otherwise. Intuitively this makes sense. If Red were to move into the Green Zone
in Iraq, his initial probability of detection would be so high that he would immediately
move to another location. We can also see that by increasing $\alpha_i$ Blue forces Red to
move more frequently and risk in-transit detection more often.
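
The departure rule above is easy to experiment with. The snippet below is a small sketch of the clamped departure time; the function name `departure_time` and all numeric inputs are assumptions used only for illustration.

```python
def departure_time(q, alpha, V):
    """Departure time clamped at zero: max((1 - q_i V_i) / (alpha_i V_i), 0)."""
    if alpha <= 0 or V <= 0:
        raise ValueError("alpha and V are assumed positive")
    return max((1.0 - q * V) / (alpha * V), 0.0)


# Assumed illustrative values with V_i = 10 time units.
low_risk = departure_time(q=0.05, alpha=0.01, V=10.0)     # Red waits before moving
high_risk = departure_time(q=0.20, alpha=0.01, V=10.0)    # q_i V_i >= 1: leave at once
more_effort = departure_time(q=0.05, alpha=0.02, V=10.0)  # larger alpha_i
print(low_risk, high_risk, more_effort)
```

Doubling the assumed rate of increase halves the waiting time in this example, which is the sense in which greater indirect aggressiveness forces Red to move more often.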

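Going back to (2) and completing the square in $G_i(t)$ we get the following:

$$\exp\{-G_i(t)\} = \exp\Bigl\{-q_i t - \frac{\alpha_i t^2}{2}\Bigr\} = \exp\Bigl\{-\frac{\alpha_i}{2}\Bigl(t + \frac{q_i}{\alpha_i}\Bigr)^2 + \frac{q_i^2}{2\alpha_i}\Bigr\} = \exp\Bigl\{\frac{q_i^2}{2\alpha_i}\Bigr\}\exp\Bigl\{-\frac{\alpha_i}{2}\Bigl(t + \frac{q_i}{\alpha_i}\Bigr)^2\Bigr\}$$

Making this substitution into (9) produces the following

$$A_i = \exp\Bigl\{\frac{q_i^2}{2\alpha_i}\Bigr\}\int_0^{T_i} \exp\Bigl\{-\frac{\alpha_i}{2}\Bigl(t + \frac{q_i}{\alpha_i}\Bigr)^2\Bigr\}\,dt + \exp\{-G_i(T_i)\}\,V_i \tag{11}$$

By letting $u = \bigl(t + \frac{q_i}{\alpha_i}\bigr)\sqrt{\frac{\alpha_i}{2}}$, (11) becomes

$$A_i = \exp\Bigl\{\frac{q_i^2}{2\alpha_i}\Bigr\}\sqrt{\frac{2}{\alpha_i}}\int_{U_{1i}}^{U_{2i}} \exp\{-u^2\}\,du + \exp\{-G_i(T_i)\}\,V_i \tag{12}$$

with our lower and upper limits of integration becoming

$$U_{1i} = \frac{q_i}{\sqrt{2\alpha_i}} \quad\text{and}\quad U_{2i} = \sqrt{\frac{\alpha_i}{2}}\Bigl(T_i + \frac{q_i}{\alpha_i}\Bigr) \tag{13}$$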

Keeping in mind the error function $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp\{-y^2\}\,dy$ we can then obtain $A_i$ in
terms of it.

$$A_i = \frac{\exp\bigl\{\frac{q_i^2}{2\alpha_i}\bigr\}\sqrt{\pi}\,\bigl[\operatorname{erf}(U_{2i}) - \operatorname{erf}(U_{1i})\bigr]}{\sqrt{2\alpha_i}} + \exp\{-G_i(T_i)\}\,V_i \tag{14}$$
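
The error-function form can be checked against a direct numerical integration. The sketch below is an illustration under assumed parameter values; the function names `a_closed_form` and `a_numeric` are hypothetical, and the midpoint rule is used only as an independent cross-check.

```python
import math


def G(t, q, alpha):
    """G_i(t) = q_i * t + alpha_i * t**2 / 2."""
    return q * t + 0.5 * alpha * t * t


def a_closed_form(q, alpha, T, V):
    """Expected survival time written with the error function."""
    u1 = q / math.sqrt(2.0 * alpha)
    u2 = math.sqrt(alpha / 2.0) * (T + q / alpha)
    integral = (math.exp(q * q / (2.0 * alpha)) * math.sqrt(math.pi)
                * (math.erf(u2) - math.erf(u1)) / math.sqrt(2.0 * alpha))
    return integral + math.exp(-G(T, q, alpha)) * V


def a_numeric(q, alpha, T, V, n=20000):
    """Same quantity with the integral of exp(-G_i(t)) done by the midpoint rule."""
    h = T / n
    integral = sum(math.exp(-G((k + 0.5) * h, q, alpha)) for k in range(n)) * h
    return integral + math.exp(-G(T, q, alpha)) * V


q, alpha, V = 0.05, 0.01, 10.0              # assumed illustrative values
T = max((1.0 - q * V) / (alpha * V), 0.0)   # clamped departure time
closed = a_closed_form(q, alpha, T, V)
numeric = a_numeric(q, alpha, T, V)
print(closed, numeric)
```

Agreement between the two values suggests the completed square and the change of variable were carried through consistently.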


2.  Direct  Detection 

We  define  direct  detection  as  the  threat  Red  faces,  if  he  is  located  on  the  same 
route  as  Blue.  In  reality,  Blue  can  directly  search  on  multiple  routes,  but  for  this  model 
Blue’s  direct  search  is  limited  to  the  route  the  convoy  is  on.  Clearly,  if  Red  happens  to  be 
on  the  same  route  as  Blue,  he  will  face  two  risks:  1)  the  risk  of  being  “given  up”  and  2) 
the  risk  of  being  found  by  Blue  before  he  can  ambush  the  convoy.  The  probability  that 
Red  will  be  detected  in  a  direct  search  by  time  t  is  given  by 

$$R_i(t) = 1 - \exp\{-F_i(t)\} \tag{15}$$

We  make  the  assumption  that  Blue’s  level  of  aggressiveness  associated  with 
directly  finding  Red  will  be  the  same  regardless  of  which  route  Blue  is  on.  If  the  physical 
characteristics  of  the  route  or  other  limitations  violate  this  assumption,  we  will  need  to 
differentiate  this  level  of  aggressiveness  as  we  did  in  the  indirect  detection  parameter  a. 
Keeping  this  in  mind,  we  define  the  forcing  function  for  the  risk  of  direct  detection  as  the 
following 


$$f_i(t) = p_i + \beta t, \qquad F_i(t) = \int_0^t f_i(s)\,ds = t p_i + \frac{\beta t^2}{2} \tag{16}$$

where $p_i$ is the initial probability that Red will be found if he is on the same route as
Blue’s recon, and $\beta$ (Blue’s direct aggressiveness) is the rate at which that risk increases
as long as he remains on the same route as Blue.




Assuming that Blue and Red are on the same route, we will define the time at
which Red decides to move as $D_i$, with his expected survival time after leaving route i
still as $V_i$. Red’s total expected survival time, including the time he spends on the same
route as Blue, is therefore given by

$$B_i = \int_0^{D_i} t\,f_i(t)\exp\{-F_i(t)\}\,dt + D_i\exp\{-F_i(D_i)\} + \exp\{-F_i(D_i)\}\,V_i \tag{17}$$

Further refining this, as we did in the case with indirect detection, we obtain the
following

$$B_i = \exp\{-F_i(D_i)\}\,V_i + \frac{\exp\bigl\{\frac{p_i^2}{2\beta}\bigr\}\sqrt{\pi}\,\bigl[\operatorname{erf}(W_{2i}) - \operatorname{erf}(W_{1i})\bigr]}{\sqrt{2\beta}} \tag{18}$$

And, as before, our limits of integration become

$$W_{1i} = \frac{p_i}{\sqrt{2\beta}} \quad\text{and}\quad W_{2i} = \sqrt{\frac{\beta}{2}}\Bigl(\frac{p_i}{\beta} + D_i\Bigr) \tag{19}$$

In the same manner as we derived $T_i$ in (10), we get the following result for when
Red will leave the route he is on, if Blue is directly searching there

$$D_i = \frac{1 - p_i V_i}{\beta V_i} \tag{20}$$

We can make the reasonable assumption that the indirect probability $q_i$ will be less
than the direct detection probability $p_i$. We also make the assumption that the rate of
increase in a direct search, $\beta$, will be equal to or greater than the rate of increase in an
indirect search, $\alpha_i$. Put another way, Blue’s patrols will be more aggressive and
therefore more likely to find Red than any effort to have a third party uncover Red’s
location. (There may be cases where the inhabitants of an area will be more effective
than Blue at capturing Red, but then Red would avoid these areas, as they give him no
advantage.) Since $p_i > q_i$ and $\beta \ge \alpha_i$ it is apparent that $D_i < T_i$ and Red will always depart
more quickly if Blue is directly searching on the same route as him. As with $T_i$, $D_i$ has
the potential of being negative. In that event, we will let $D_i = 0$.
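
The ordering of the two departure times can be illustrated with a few sample values. This is only a sketch: the function name `clamp_departure` and the parameter values are assumptions, chosen so that the direct initial probability exceeds the indirect one and the direct rate is at least the indirect rate.

```python
def clamp_departure(initial, rate, V):
    """Shared form of both departure rules: max((1 - initial * V) / (rate * V), 0)."""
    return max((1.0 - initial * V) / (rate * V), 0.0)


# Assumed illustrative values satisfying p_i > q_i and beta >= alpha_i.
q, alpha = 0.05, 0.01   # indirect detection
p, beta = 0.08, 0.02    # direct detection

rows = []
for V in (2.0, 5.0, 10.0, 15.0, 25.0):
    T = clamp_departure(q, alpha, V)   # Blue searching elsewhere
    D = clamp_departure(p, beta, V)    # Blue on the same route
    rows.append((V, T, D))
    print(f"V={V:5.1f}  T={T:7.2f}  D={D:7.2f}")
```

In every row the clamped direct-search departure time is no larger than the indirect one; the two coincide only once both have been clamped to zero.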

3.  Stochastic Process

As noted by Owen and McCormick, this is clearly a stochastic game in which
Red’s survival time is dependent upon how often he is allowed to move. To denote this,
we will use $A_i^m$, where m represents the number of times Red is allowed to move and i is
the route that he starts on, under the assumption that Blue is searching on another route. If
Blue is on the same route as Red, we will represent Red’s expected survival time as
before with $B_i^m$. After he moves, Red will be able to move m-1 additional times, and he
expects his remaining survival time to be $V_i^{m-1}$.


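Considering the case when Red is not allowed to move, we let $V_i^{-1} = 0$. Red will
stay on the route until he is captured. In this case, $T_i$ becomes infinite and our upper
bound on equation (12) goes to infinity along with $G_i(T_i)$. This leads to
$\operatorname{erf}(\infty) = 1$ and $\lim_{T_i \to \infty} \exp\{-G_i(T_i)\} = 0$, and equation (14) becomes

$$A_i^0 = \frac{\exp\bigl\{\frac{q_i^2}{2\alpha_i}\bigr\}\sqrt{\pi}\,\bigl[1 - \operatorname{erf}(U_{1i})\bigr]}{\sqrt{2\alpha_i}} \tag{21}$$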


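If we carry this out recursively we see that the general form of (21) becomes

$$A_i^m = \frac{\exp\bigl\{\frac{q_i^2}{2\alpha_i}\bigr\}\sqrt{\pi}\,\bigl[\operatorname{erf}(U_{2i}) - \operatorname{erf}(U_{1i})\bigr]}{\sqrt{2\alpha_i}} + \exp\{-G_i(T_i)\}\,V_i^{m-1} \tag{22}$$

Applying this to the time when Red departs the route we get

$$T_i = \max\Bigl(\frac{1 - q_i V_i^{m-1}}{\alpha_i V_i^{m-1}},\, 0\Bigr) \tag{23}$$

Similarly, we derive $B_i^m$ in the following manner

$$B_i^m = \frac{\exp\bigl\{\frac{p_i^2}{2\beta}\bigr\}\sqrt{\pi}\,\bigl[\operatorname{erf}(W_{2i}) - \operatorname{erf}(W_{1i})\bigr]}{\sqrt{2\beta}} + \exp\{-F_i(D_i)\}\,V_i^{m-1} \tag{24}$$

with

$$D_i = \max\Bigl(\frac{1 - p_i V_i^{m-1}}{\beta V_i^{m-1}},\, 0\Bigr) \tag{25}$$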
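
One step of this recursion can be sketched in a few lines. The code below is an illustration under assumptions: the function names `hazard_integral` and `survival_step`, the parameter values, and the assumed previous-stage value of 10 are hypothetical. It treats a previous-stage value of zero as the no-move case, where the departure time is infinite.

```python
import math


def hazard_integral(initial, rate, depart):
    """exp(c^2 / 2r) * sqrt(pi) * [erf(upper) - erf(lower)] / sqrt(2r)."""
    lower = initial / math.sqrt(2.0 * rate)
    upper = math.sqrt(rate / 2.0) * (depart + initial / rate)
    return (math.exp(initial ** 2 / (2.0 * rate)) * math.sqrt(math.pi)
            * (math.erf(upper) - math.erf(lower)) / math.sqrt(2.0 * rate))


def survival_step(initial, rate, v_prev):
    """Return (departure time, expected survival time) given V_i^{m-1} = v_prev."""
    if v_prev <= 0.0:
        # No move allowed: Red stays until captured, so erf(upper) -> 1.
        lower = initial / math.sqrt(2.0 * rate)
        value = (math.exp(initial ** 2 / (2.0 * rate)) * math.sqrt(math.pi)
                 * (1.0 - math.erf(lower)) / math.sqrt(2.0 * rate))
        return math.inf, value
    depart = max((1.0 - initial * v_prev) / (rate * v_prev), 0.0)
    stay = math.exp(-(initial * depart + 0.5 * rate * depart ** 2))
    return depart, hazard_integral(initial, rate, depart) + stay * v_prev


q, alpha, p, beta = 0.05, 0.01, 0.08, 0.02   # assumed illustrative values
_, a0 = survival_step(q, alpha, 0.0)         # A_i^0
_, b0 = survival_step(p, beta, 0.0)          # B_i^0
t_i, a_m = survival_step(q, alpha, 10.0)     # T_i and A_i^m for assumed V_i^{m-1} = 10
d_i, b_m = survival_step(p, beta, 10.0)      # D_i and B_i^m for the same V_i^{m-1}
print(a0, b0, t_i, a_m, d_i, b_m)
```

The same function serves both the indirect case (initial probability q, rate alpha) and the direct case (initial probability p, rate beta), since the two recursions share one form.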

Finding the expected time until capture along each route, assuming Red is not
allowed to move (i.e., m = 0), is relatively simple. For m > 0 we must take into
consideration the time Red expects to gain from moving. This is done by multiplying
the expected time until detection for the route he is planning to move to by the probability
that he will successfully complete the move.

As Red moves from route i to route j, there is the probability $p_{ij}$ that he will
complete his move without being captured. (We can think of P as being a symmetric
matrix made up of these probabilities based on the distances between routes with zeros
along the diagonal.) Red will then survive an additional $A_j$ or $B_j$ when he arrives. If
Red moves to a route where Blue is not directly searching, he can expect to gain an
additional $\tau_j$ units of time if he survives the move

$$\tau_j = p_{ij} A_j \tag{26}$$

Similarly, if Red moves to a route where Blue is directly searching he can expect to gain
$\sigma_j$ units of time if he survives the move

$$\sigma_j = p_{ij} B_j \tag{27}$$

Focusing on a single route i, we can see the expected gain in survival time Red
may obtain from moving to a different route j. By ordering our $\tau_j$ in decreasing order
(i.e., $\tau_1 > \tau_2 > \dots > \tau_n$) we can use the following algorithm developed by Owen and
McCormick to determine the increase in survival time after the move, $V_i$, from route i.


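Algorithm.

1.  Let k = n (the number of routes)
2.  Let $v = V_k$, computed using (28)
3.  If $v < \tau_k$, proceed to step 6
4.  If $v > \tau_k$, let $k = \max\{j \mid \tau_j > v\}$
5.  Return to step 2
6.  Compute $x^*$ and $y^*$ using (29)

$$V_k = \frac{\displaystyle\sum_{j=1}^{k} \frac{\tau_j}{\tau_j - \sigma_j} - 1}{\displaystyle\sum_{j=1}^{k} \frac{1}{\tau_j - \sigma_j}} \tag{28}$$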
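
The first five steps of the algorithm can be sketched as a short loop. This is an illustration, not the authors' implementation: the function name `game_value` and the sample values are assumptions, the value formula used is the equalizing value of a game whose payoff is sigma_j when Blue searches Red's route and tau_j otherwise, and a tie between v and tau_k is handled as in step 4.

```python
def game_value(tau, sigma):
    """Return (v, k) for routes already sorted by decreasing tau.

    tau[j]  : Red's expected gain if Blue is not searching route j.
    sigma[j]: Red's expected gain if Blue is searching route j (sigma[j] < tau[j]).
    """
    if any(s >= t for t, s in zip(tau, sigma)):
        raise ValueError("this sketch assumes sigma_j < tau_j on every route")
    if any(a < b for a, b in zip(tau, tau[1:])):
        raise ValueError("tau must be sorted in decreasing order")

    def value(k):
        # Value of the game restricted to the k routes with the largest tau.
        num = sum(t / (t - s) for t, s in zip(tau[:k], sigma[:k])) - 1.0
        den = sum(1.0 / (t - s) for t, s in zip(tau[:k], sigma[:k]))
        return num / den

    k = len(tau)                       # step 1
    while True:
        v = value(k)                   # step 2
        if v < tau[k - 1]:             # step 3: every route in use beats the value
            return v, k
        # step 4: keep only routes whose tau exceeds v, then repeat (step 5)
        k = max(j + 1 for j, t in enumerate(tau) if t > v)


tau = [10.0, 8.0, 3.0]     # assumed illustrative values
sigma = [4.0, 5.0, 1.0]
v, k = game_value(tau, sigma)
print(v, k)
```

With these sample values the third route is dropped, because its best-case gain is below the value Red can secure by mixing over the first two routes.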

Solving for $(A_i^0, B_i^0)$ and then $V_i^0$ allows us to solve for $(A_i^1, B_i^1)$, $D_i$, $T_i$, and then $V_i^1$.
Repeating this iterative process allows us to find the expected times until capture for
increasing values of m, the number of times we allow Red to change routes.

Owen and McCormick discovered that while computing these quantities for
increasing values of m their values change very little after approximately m = 10. In their
paper they proved convergence with the assumption that some risk is incurred every time
Red moves (i.e., $p_{ij} < 1$). The greater the $p_{ij}$, the faster the expected times converge. In a
risky environment, it should not be necessary to compute for values of m greater than 10,
as it is unlikely Red can look more than 5 or 6 steps in advance to see where he should
move.

Incidentally, if we were concerned only with Red’s attempt at survival, we could
compute the optimal strategies for Red’s use of routes and Blue’s strategy for finding
him. Given that the game has a value $v_k$, we can compute $x^*$ and $y^*$, the optimal strategies
for Red and Blue respectively, in the following manner

$$x_i^* = \frac{1/(\tau_i - \sigma_i)}{\sum_{j=1}^{k} 1/(\tau_j - \sigma_j)} \ \text{ for } 1 \le i \le k, \text{ and } 0 \text{ for } k+1 \le i \le n$$

$$y_i^* = \frac{\tau_i - v_k}{\tau_i - \sigma_i} \ \text{ for } 1 \le i \le k, \text{ and } 0 \text{ for } k+1 \le i \le n \tag{29}$$
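
The mixed strategies can be computed and checked with a short script. The sketch below assumes the routes are sorted by decreasing tau and that the number k of routes in use is already known; the names `optimal_strategies` and `red_payoff` and the sample values are hypothetical.

```python
def optimal_strategies(tau, sigma, k):
    """Mixed strategies over the k routes in use; unused routes get probability 0."""
    n = len(tau)
    inv_gap = [1.0 / (tau[i] - sigma[i]) for i in range(k)]
    total = sum(inv_gap)
    v = (sum(tau[i] * inv_gap[i] for i in range(k)) - 1.0) / total       # value v_k
    x = [inv_gap[i] / total for i in range(k)] + [0.0] * (n - k)          # Red
    y = [(tau[i] - v) * inv_gap[i] for i in range(k)] + [0.0] * (n - k)   # Blue
    return v, x, y


tau = [10.0, 8.0, 3.0]    # assumed illustrative values, sorted by decreasing tau
sigma = [4.0, 5.0, 1.0]
v, x, y = optimal_strategies(tau, sigma, k=2)


def red_payoff(j, blue_mix):
    """Red's expected gain from route j against Blue's mixed strategy."""
    return tau[j] - blue_mix[j] * (tau[j] - sigma[j])


print(v, x, y)
print([red_payoff(j, y) for j in range(len(tau))])
```

Against Blue's mix, every route Red actually uses yields the game value, and the unused route yields less, so Red has no incentive to deviate.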

A  benefit  of  this  model  is  its  ability  to  determine  where  Red  is  most  likely  to 
move  to  when  he  does  decide  to  move.  Owen  and  McCormick  further  explored  this 
property  with  the  assumption  that  we  know  the  last  location  Red  occupied  [3].  We  will 
not pursue the same analysis in this model. If we knew that Red had moved, and from which
route, that would give us a route to use without fear of ambush, thus removing the need for
further analysis for route selection. This property is worth noting in the event Blue
does learn of Red’s last position and wishes to send a patrol out to uncover him.

B.  ANALYSIS  OF  SEARCH  MODEL


1.  Decreasing  Expected  Time  to  Detection  for  a  Single  Route 

We make the assumption that Red can move multiple times to both increase his
chances of attacking Blue and to avoid detection. Therefore, the cases $A_i^0$ and $B_i^0$ do not
concern us other than to establish the semi-steady state for $A_i^m$ and $B_i^m$. Blue’s goal is to
drive $T_i$ and $D_i$ to zero in order to make that route unattractive to Red and safer for the
convoy. Keeping in mind equations (23) and (25), simply increasing Blue’s indirect
aggressiveness ($\alpha_i$) or direct aggressiveness ($\beta$) will decrease Red’s expected time to
detection but it will not get it to zero. The ultimate goal is to drive the values $q_i V_i^{m-1} \ge 1$
and $p_i V_i^{m-1} \ge 1$. Doing so brings Red’s departure times to zero, and he moves as soon as
he finds himself on that route. Blue can accomplish this by placing all of his effort onto
the route with the largest initial probability of indirect or direct detection. Since we are
dealing with a single route, we will focus on direct detection from here on. With
only one route the value then becomes $B_i^0$ (keeping in mind that $V_i^{-1} = 0$ and
letting k = 1 for equation (26)) with the additional assumption that Red will be on that
route to ambush Blue and neither will switch to another route. Blue is then left with
trying to drive $B_i^0$ to zero. How aggressive must the patrols be ($\beta$) to make this happen?
Taking equation (21) and applying it to $B_i^0$ gives us

$$B_i^0 = \frac{\exp\bigl\{\frac{p_i^2}{2\beta}\bigr\}\sqrt{\pi}\,\bigl[1 - \operatorname{erf}(W_{1i})\bigr]}{\sqrt{2\beta}} \tag{30}$$

Evaluating (30) shows us that $B_i^0$ can never be zero since $\beta$, $p_i$ are always positive. The
only way to reduce the value of the route is to increase Blue’s direct aggressiveness ($\beta$),
and the payoff for this effort decreases exponentially. This is what we have come to
intuitively understand and, although we are not limiting ourselves to a single route, it can
help us see our diminishing rate of return on effort along a single route. From this
foundation we will shift our analysis to multiple routes.
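
The diminishing return on direct aggressiveness can be seen by tabulating the single-route value for a few rates. The sketch below uses assumed values for the initial probability and for the sequence of rates (each double the last); the function name `b_zero` is hypothetical.

```python
import math


def b_zero(p, beta):
    """Red's expected survival time when he never leaves the route Blue searches."""
    w1 = p / math.sqrt(2.0 * beta)
    # erfc(w) equals 1 - erf(w) but keeps precision when w is large.
    return (math.exp(p * p / (2.0 * beta)) * math.sqrt(math.pi)
            * math.erfc(w1) / math.sqrt(2.0 * beta))


p = 0.08                                   # assumed initial detection probability
betas = [0.01, 0.02, 0.04, 0.08, 0.16]     # assumed doubling of direct aggressiveness
values = [b_zero(p, b) for b in betas]
for b, val, prev in zip(betas[1:], values[1:], values):
    print(f"beta={b:.2f}  B0={val:.3f}  reduction={prev - val:.3f}")
```

In this example each doubling of the rate buys a smaller reduction than the one before, and the value stays strictly positive throughout.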

2.  Decreasing  Expected  Time  to  Detection  for  Multiple  Routes

As  with  Ruckle’s  work  on  the  geometric  approach  to  the  game  [2],  we  will 
assume  Blue  takes  a  single  route  to  his  destination  (i.e.,  straight  line).  We  will  also  make 
the  assumption  that  the  routes  are  independent  of  each  other  (i.e.,  they  do  not  cross).  This 
is  important,  because  if  the  routes  intersected  at  a  common  point,  then  it  becomes  a  game 
with  one  route  at  that  point.  We  can  then  study  the  routes  independent  of  each  other. 

Using the above model, what must Blue do to secure a route? Clearly if he is very
aggressive (both indirectly and directly, $\alpha_i$ and $\beta$) he can then bring all $A_i$ and $B_i$ close
to zero, but at the expense of spending greater resources. We have already established
that they can never be zero because of the exponential nature of the risk. The best Blue
can do is to drive the time at which Red will depart a route ($T_i$ and $D_i$) to zero so that Red
will move immediately away from that route if he is on it. If he is able to accomplish
this, then the equations (22) and (24) become:

$$A_i^m = \exp\{-q_i\}\,V_i^{m-1} \quad\text{and}\quad B_i^m = \exp\{-p_i\}\,V_i^{m-1}$$

Let us first focus on how Blue might get these departure times ($T_i$ and $D_i$) to be
zero. By increasing his indirect aggressiveness ($\alpha_i$), Blue can bring Red’s expected survival
time if Blue is not on the route ($A_i$) closer to the expected survival time if Blue is on the
route ($B_i$), and the value of that route goes to $B_i$. As the value of a route is more dependent
on $A_i$, Blue gets a greater payoff in the reduction of the route’s value by being indirectly
aggressive ($\alpha_i$), but this does not get Red’s expected survival time any lower than if Blue
were on that route ($B_i$). To reduce Red’s expected survival time more, and ultimately to
get it to zero, Blue must focus his effort on directly finding Red. This suggests that Blue
must dramatically increase his direct aggressiveness ($\beta$) for this to happen. However,
Blue has another option.

As already mentioned, the value of a route in the presence of other routes ($V_i^{m-1}$)
must be large enough so that $q_i V_i^{m-1} \ge 1$ and/or $p_i V_i^{m-1} \ge 1$. To do this, Blue
simply has to add routes under consideration. However, as Blue adds routes and
increases $V_i^{m-1}$, the value of Red’s expected survival time with Blue on the route ($B_i^m$)
goes up. To strike a balance between the two, we find that $V_i^{m-1} = 1/p_i$ provides us the
minimum value we need a route to have so as to bring the time when Red moves ($D_i$) to
zero (likewise for $T_i$ and $V_i^{m-1} = 1/q_i$). To minimize the expected survival time for Red if
Blue is on the route, Blue must now adjust his efforts so that $V_i^{m-1} = 1/p_i$ and no
more. In reality, Blue will want to provide enough “usable” routes where Red will
move immediately if he discovers that Blue is on them (i.e., $D_i = 0$) while accepting the
increase in $B_i^m$ so that he can randomize which route he takes. For the purposes of this
game, we will only consider those routes that Blue is considering using and on which he is
currently exerting effort (either indirect or direct) to find Red. (Remember that $F_i(t)$ and
$G_i(t)$ must be strictly increasing.) In other words, Blue wants to provide enough viable routes for
Red to choose from and hide on while reducing the attractiveness of certain routes. This
may not always be possible, since Blue often has only a finite number of routes to choose
from and must make the most of what is available. In addition to limited routes, the
enemy also gets a vote. Red, being an intelligent player, will try to overcome Blue’s
efforts to avoid him by applying his own ambush model.
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III.  AMBUSH  MODEL 


A.  OVERVIEW 

No  matter  how  unattractive  Blue  makes  a  route,  if  he  continues  to  utilize  that 
route,  Red  will  be  tempted  to  move  onto  that  route  and  ambush  Blue.  Red  will  risk  direct 
detection  to  achieve  his  own  goals.  Blue’s  probability  of  encountering  a  hazard,  either 
indirect  or  direct,  as  he  continues  to  utilize  the  same  route  convoy  after  convoy  will 
increase. We assume that Red's efforts to intercept Blue will remain constant regardless
of route and represent them with the variable $\gamma$. If this assumption is not valid, we will have
to differentiate Red's aggressiveness by route, as we did for $\alpha$. The more aggressive Red
becomes, the greater the value of $\gamma$.


1.  Indirect  Hazard 


Every  convoy  runs  the  risk  of  not  completing  the  journey  regardless  of  enemy 
action.  This  could  be  the  result  of  treacherous  terrain  (think  of  Hannibal’s  journey  across 
the  Alps)  or  a  longer  route  that  can  result  in  more  breakdowns.  With  this  in  mind,  we 
define $r_i$ as our initial probability of a hazard for Blue crossing route i. Note that this
model takes the same form as our model for Red's threat of indirect detection. As we did
there, we will start by defining $S_i(t)$ to be the probability of an unsuccessful completion of
a convoy,

$$S_i(t) = 1 - \exp\{-H_i(t)\} \tag{31}$$

where our strictly increasing risk is defined by

$$h_i(t) = r_i + \gamma t, \qquad H_i(t) = \int_0^t h_i(s)\,ds = t r_i + \frac{\gamma t^2}{2} \tag{32}$$

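Following the same derivation we have used before, the expected survival time
for Blue along route i, assuming he can change routes m times, is

$$C_i^m = \frac{\exp\{r_i^2/(2\gamma)\}\,\sqrt{\pi}\,[\operatorname{erf}(K_{2i}) - \operatorname{erf}(K_{1i})]}{\sqrt{2\gamma}} + \exp\{-H_i(M_i)\}\,V_i^{m-1} \tag{33}$$

where the limits of integration are given by

$$K_{1i} = \frac{r_i}{\sqrt{2\gamma}} \quad \text{and} \quad K_{2i} = \frac{r_i}{\sqrt{2\gamma}} + M_i\sqrt{\frac{\gamma}{2}} \tag{34}$$

and the expected time of departure from route i is

$$M_i = \max\left(\frac{1 - r_i V_i^{m-1}}{\gamma V_i^{m-1}},\ 0\right) \tag{35}$$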
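The indirect-hazard step can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are ours, `v_next` stands for the value of leaving the route ($V_i^{m-1}$, assumed positive), and the closed form is the integral of $\exp\{-H_i(t)\}$ from 0 to $M_i$ plus the discounted value after departure, for the linear hazard $h_i(t) = r_i + \gamma t$.

```python
import math


def cumulative_hazard(t, r, gamma):
    """H(t) = r*t + gamma*t**2/2 for the linear hazard h(t) = r + gamma*t."""
    return r * t + 0.5 * gamma * t * t


def departure_time(r, gamma, v_next):
    """Time at which h(t) * v_next reaches 1, floored at zero (form of eq. 35)."""
    return max((1.0 - r * v_next) / (gamma * v_next), 0.0)


def expected_survival(r, gamma, v_next):
    """Time accrued on the route before leaving plus the discounted value after (form of eq. 33)."""
    m_i = departure_time(r, gamma, v_next)
    root = math.sqrt(2.0 * gamma)
    k1 = r / root                               # lower limit of integration
    k2 = k1 + m_i * math.sqrt(gamma / 2.0)      # upper limit of integration
    on_route = (math.exp(r * r / (2.0 * gamma)) * math.sqrt(math.pi)
                * (math.erf(k2) - math.erf(k1)) / root)
    return on_route + math.exp(-cumulative_hazard(m_i, r, gamma)) * v_next


# Hypothetical inputs, chosen only to exercise the functions.
print(departure_time(0.1, 0.1, 5.0), expected_survival(0.1, 0.1, 5.0))
```

When the departure time is zero the first term vanishes and the function simply returns `v_next`, which mirrors the statement that Blue leaves immediately. The same three functions apply to the direct hazard by passing $s_i$ in place of $r_i$.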


2.  Direct  Hazard 


Some routes give Red a greater advantage in ambushing Blue than others. Some
routes provide more than adequate cover for Red to hide in, or choke points for him to
utilize. Given this condition, we will assign the initial probability of a successful ambush
on each route as $s_i$, and our strictly increasing risk is defined by

$$j_i(t) = s_i + \gamma t, \qquad J_i(t) = \int_0^t j_i(z)\,dz = t s_i + \frac{\gamma t^2}{2} \tag{36}$$

This leads us to define our expected time of survival for Blue on route i as

$$E_i^m = \frac{\exp\{s_i^2/(2\gamma)\}\,\sqrt{\pi}\,[\operatorname{erf}(L_{2i}) - \operatorname{erf}(L_{1i})]}{\sqrt{2\gamma}} + \exp\{-J_i(N_i)\}\,V_i^{m-1} \tag{37}$$

where the limits of integration are given by

$$L_{1i} = \frac{s_i}{\sqrt{2\gamma}} \quad \text{and} \quad L_{2i} = \frac{s_i}{\sqrt{2\gamma}} + N_i\sqrt{\frac{\gamma}{2}} \tag{38}$$

and the expected time of departure from route i is

$$N_i = \max\left(\frac{1 - s_i V_i^{m-1}}{\gamma V_i^{m-1}},\ 0\right) \tag{39}$$


3.  Stochastic  Process 

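We use the same method as we did for the Search Model in developing the
expected time to ambush for Blue on each route given that Blue can change routes m
times. If m = 0, then our expected times for indirect or direct hazard respectively are:

$$C_i^0 = \frac{\exp\{r_i^2/(2\gamma)\}\,\sqrt{\pi}\,\left[1 - \operatorname{erf}\left(r_i/\sqrt{2\gamma}\right)\right]}{\sqrt{2\gamma}}, \qquad E_i^0 = \frac{\exp\{s_i^2/(2\gamma)\}\,\sqrt{\pi}\,\left[1 - \operatorname{erf}\left(s_i/\sqrt{2\gamma}\right)\right]}{\sqrt{2\gamma}} \tag{40}$$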

As with Red, Blue will face some risk by changing routes, such as a greater
distance to cover. We will continue to use the terms $\tau$ and $\sigma$ to represent the
expected gain in time by moving onto route j, depending on whether Red is absent or present
(respectively), but we will use $\Theta$ to represent our matrix of completion probabilities, as
we did for P in the previous model.

If Blue moves to a route where Red is not directly searching, he can expect to
gain an additional $\tau_{ij}$ units of time if he survives the move:

$$\tau_{ij} = \theta_{ij} C_j \tag{41}$$

Similarly, if Blue moves to a route where Red is directly searching, he can expect to gain
$\sigma_{ij}$ units of time if he survives the move:

$$\sigma_{ij} = \theta_{ij} E_j \tag{42}$$

Then by using (28) we can determine the value of each route. Using this, we can
iteratively find the expected survival time along each route by increasing m until a
stable time appears in equations (33) and (37). We should keep in mind that the
computation of the expected survival time after leaving route i ($V_i$) is done as before in
equation (28), with the expected times until indirect hazard sorted in decreasing order
(i.e., $\tau_1 > \tau_2 > \dots > \tau_n$). If we want to see what Red and Blue's optimal strategies are in
the absence of the Search Model, we can use equation (29). Note, however, that in this
case $x^*$ is Blue's optimal strategy, since he is the row player, and $y^*$ is Red's optimal
strategy, since he is the column player.


B.  ANALYSIS  OF  AMBUSH  MODEL 

1.  Increasing  Expected  Survival  Time  on  a  Single  Route 

As we saw in the analysis of the Search Model, restricting ourselves to a single
route reduces the value of the game to that of the expected survival time associated with a
direct hazard. Again, this value can never be zero but gets exponentially closer to zero
with increasing levels of effort ($\gamma$) on the part of Red to increase the probability of
hazard. The challenge for Blue now, if he wishes to make it through the route
successfully, is to ensure his expected survival time, $E^0$, is greater than Red's expected
survival time, $B^0$, along that route. He can do this by making sure Red's probability of
detection, p, is greater than his initial probability of a hazard, r, and that his efforts to
increase the rate of detection, $\beta$, are greater than Red's efforts to increase the linear rate
of interference, $\gamma$.

Of course, intuitively, this is what one expects. If the route is a highway with
clear fields of view, and Blue actively sends recons up and down the route, then he is apt
to survive longer along that route than Red. The converse is also intuitive. If a route
goes through an area where Red can easily hide, and Red is very aggressive along that
route, then Blue's survival time will be lower than that of Red. Given the two options,
Blue should always choose the former, thus leaving Red only the option of significantly
increasing his efforts $\gamma$ to bring $E^0$ closer to $B^0$. Given enough time, though, Blue will
be ambushed along that route. Blue can reduce this risk by adding routes to choose from.
This is especially true if the routes are not significantly favorable to Blue.

2.  Increasing  Expected  Survival  Time  on  Multiple  Routes 

In  this  model,  Blue’s  only  influence  over  his  expected  time  of  survival  until 
encountering  a  hazard  is  to  add  more  routes  under  consideration.  As  we  see  in  equations 
(33)  and  (37)  the  expected  survival  time  after  moving  goes  up  as  we  add  more  routes. 
However, this also drives some of the times till departure, (35) and (39), to zero as some
routes become more favorable. As with the Search Model, it becomes disadvantageous
for Blue to add more routes if they do not offer an advantage over routes already under
consideration. There is also the practical matter of having only a finite number of
possible routes to choose from in a realistic scenario. We will therefore limit our study to
a few routes, with the understanding that Blue has chosen from the best available to him.

IV.  BIMATRIX  MODEL


A.  OVERVIEW 

We  have  now  developed  two  models.  In  each  model,  one  player  is  trying  to 
maximize  his  survival  time  while  minimizing  the  other  player’s  survival  time.  Taking 
these  two  models  we  will  define  the  payoff  matrices  in  the  following  manner 

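$\Psi$ is the Search Payoff Matrix, where $\psi_{ij} = A_i$ if $i \neq j$ and $\psi_{ij} = B_i$ if $i = j$.

$\Lambda$ is the Ambush Payoff Matrix, where $\lambda_{ij} = C_i$ if $i \neq j$ and $\lambda_{ij} = E_i$ if $i = j$.

(Note: from here on we will use the transpose of $\Lambda$, since we need Blue to play the
columns of both matrices, but $\Lambda$ was formed with Blue playing the rows.)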

These payoff matrices form the basis for our bimatrix game. John Nash's
renowned 1951 paper on non-cooperative games [4] proved that for every bimatrix
game a pair of strategies exists in which each player's strategy is a best response to the
other's. At this equilibrium point, neither player can obtain a greater value by applying a
different strategy while his opponent's strategy remains unchanged. If both players
change their strategies, then either party, or both, can obtain a greater value for their
game.

Both  Red  and  Blue  can  choose  a  single  route  (a  pure  strategy)  or  they  can 
randomize  their  route  choice  by  assigning  a  probability  to  the  likelihood  that  they  will  use 
it  (a  mixed  strategy).  Let  X  be  the  set  of  all  possible  mixed  strategies  for  Red  and  Y  be 
the set of all possible mixed strategies for Blue. The expected survival time for Red is
$E_{Red}(x, y) = x^T \Psi y$ for some $x \in X$ and $y \in Y$, and likewise for Blue the expected
survival time is $E_{Blue}(x, y) = x^T \Lambda^T y$ for some $x \in X$ and $y \in Y$. The Nash equilibrium
is a pair of mixed strategies from which neither player can improve his survival time by
deviating alone. Letting $x^*$ and $y^*$ be our optimal strategies, we can state it in the following way:

$$E_{Red}(x^*, y^*) = x^{*T} \Psi y^* \ge x^T \Psi y^* \quad \forall x \in X$$
$$E_{Blue}(x^*, y^*) = x^{*T} \Lambda^T y^* \ge x^{*T} \Lambda^T y \quad \forall y \in Y$$

Once we find $x^*$ and $y^*$, we define the value of the game for each player as
$V_{Red} = x^{*T} \Psi y^*$ and $V_{Blue} = x^{*T} \Lambda^T y^*$. The player with the larger value has the advantage
given the Nash equilibrium identified. Unfortunately, there is not always a single
equilibrium point in their available strategies.

1.  Finding  Pure  Equilibrium  Points 

Finding pure equilibrium points is relatively easy. Looking at Red's payoff
matrix, $\Psi$, we choose the largest expected survival time in each column. For Blue, we
look at the transpose of his payoff matrix, $\Lambda^T$, and choose the largest value in each row.
The locations (i, j) where these maxima occur simultaneously are the pure
strategies where Red will use route i and Blue will use route j. As an example, take the
following payoff matrices

$$\Psi = \begin{bmatrix} 2 & \langle 5 \rangle & 5 \\ \langle 6 \rangle & 3 & \langle 6 \rangle \\ 4 & 4 & 1 \end{bmatrix} \qquad \Lambda^T = \begin{bmatrix} 10 & 16 & \langle 18 \rangle \\ \langle 22 \rangle & 8 & 18 \\ \langle 22 \rangle & 16 & 12 \end{bmatrix}$$

We have labeled using $\langle\ \rangle$ those column and row entries that are the largest for
Red's columns and Blue's rows (respectively). From the example, we see that our
equilibrium point is met when Blue chooses route 1 and Red chooses route 2. The
advantage is clearly in Blue's favor, as he is expected to survive longer than Red. This
equilibrium should not be any surprise, as our payoff matrices are constructed in such a
fashion that the pure strategy for either player will always be the route that provides the
longest possible expected survival time if the adversary is not on the route. Furthermore,
since $A_i > B_i$ and $C_i > E_i$, the only possibility of Red and Blue choosing the same route
(equilibrium on the diagonal) is in the event both indirect and direct detection/hazard
times are equal. We can intuitively understand this result, however unlikely, as both Red
and Blue have nothing to gain if they end up on the same route.

If Red or Blue wanted to alter the expected survival times, they could do so by
spending resources (increasing their aggressiveness) to increase the rate of detection or
hazards ($\alpha$, $\beta$ and $\gamma$), thereby decreasing the value for their opponent.

We must note that more than one pure strategy may appear using this
method. The strategy to choose would be the one that offers the maximum survival time
to both players. If such a strategy does not exist, or multiple pure strategies occur with
the same value, Red and Blue will need to determine their optimal mixed strategies. This
is far more realistic, since Blue and Red will not always play a perfect game ensuring that
they never pick the same route. Blue and Red will want to randomize their route
selection so as not to give an advantage to the other player in detecting them.
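The search for pure equilibrium points described in this section can be sketched in a few lines. This is a minimal illustration, not the procedure used to produce the results in this paper: the function name is ours, both matrices are indexed with Red on the rows and Blue on the columns, and the 3 x 3 matrices are the example payoff matrices as we read them.

```python
def pure_equilibria(red_payoff, blue_payoff):
    """Return 1-based (red_route, blue_route) pairs that are mutual best responses.

    red_payoff[i][j] and blue_payoff[i][j] both have Red on rows and Blue on columns,
    so Red compares entries within a column and Blue compares entries within a row.
    """
    n_rows, n_cols = len(red_payoff), len(red_payoff[0])
    found = []
    for i in range(n_rows):
        best_for_blue = max(blue_payoff[i])            # Blue's best reply to Red on route i
        for j in range(n_cols):
            best_for_red = max(red_payoff[k][j] for k in range(n_rows))  # Red's best reply to j
            if red_payoff[i][j] == best_for_red and blue_payoff[i][j] == best_for_blue:
                found.append((i + 1, j + 1))
    return found


PSI_EXAMPLE = [[2, 5, 5], [6, 3, 6], [4, 4, 1]]
LAMBDA_T_EXAMPLE = [[10, 16, 18], [22, 8, 18], [22, 16, 12]]

print(pure_equilibria(PSI_EXAMPLE, LAMBDA_T_EXAMPLE))  # [(2, 1)]
```

The single pair returned corresponds to Red on route 2 and Blue on route 1. Ties are kept rather than broken, so a matrix with repeated maxima yields every qualifying pair.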

2.  Finding  Mixed  Equilibrium  Points 

Finding all possible mixed Nash equilibria can be a daunting task. In 1964,
Lemke and Howson [5] showed how to obtain all of the mixed equilibrium points in a
two-person game using non-linear programming. Their algorithm states that the
strategies $x^*$ and $y^*$ are Nash equilibria if and only if they maximize the following
non-linear equation and constraints [6]:

$$\max_{x, y, p, q}\ x^T \Psi y + x^T \Lambda^T y - p - q$$

subject to:

$$\Psi y \le p J_n, \qquad \Lambda x \le q J_n \quad (\text{where } J_n \text{ is an } n \times 1 \text{ vector of all ones})$$
$$x_i \ge 0 \ \text{and} \ y_i \ge 0 \quad \forall i \in (1..n) \tag{44}$$
$$\sum_{i=1}^{n} x_i = \sum_{j=1}^{n} y_j = 1$$

($p^* = V_{Red}$ and $q^* = V_{Blue}$)

Solving the above problem is best done using software. Barron [6] provides the
Maple and Mathematica commands for setting up and solving such a problem. There are
also multiple software packages available, such as SNOPT and KNITRO, which can be
used for solving problems involving a large number of routes. For the examples given
below with relatively few routes, we will rely on Maple's NLPSolve command to find
multiple mixed Nash equilibria. We accomplish this by altering the starting point for the
non-linear program search by varying the values of p and q. This is no trivial task, as the
upper bound on the number of equilibria is theorized to be $2^n - 1$ in an n x n game [7].
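A candidate pair of strategies can be checked against the constraints of (44) without a solver. The sketch below is an illustration only, with function names of our own choosing: it sets p and q to the two expected payoffs, so the objective is exactly zero, and then tests whether any pure deviation would beat those payoffs. For any feasible point the objective cannot exceed zero, so feasibility with p and q set this way is what identifies an equilibrium. The small matrices are hypothetical inputs taken from the 3 x 3 example as we read it.

```python
def bilinear(x, matrix, y):
    """Return x^T * matrix * y for plain Python lists."""
    return sum(x[i] * matrix[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def satisfies_44(psi, lam_t, x, y, tol=1e-9):
    """Set p and q to the expected payoffs and test the inequality constraints of (44)."""
    n = len(x)
    p = bilinear(x, psi, y)      # Red's expected payoff
    q = bilinear(x, lam_t, y)    # Blue's expected payoff
    red_pure = [sum(psi[i][j] * y[j] for j in range(n)) for i in range(n)]      # entries of Psi y
    blue_pure = [sum(x[i] * lam_t[i][j] for i in range(n)) for j in range(n)]   # entries of Lambda x
    feasible = all(v <= p + tol for v in red_pure) and all(v <= q + tol for v in blue_pure)
    return p, q, feasible


PSI_3 = [[2, 5, 5], [6, 3, 6], [4, 4, 1]]
LAMBDA_T_3 = [[10, 16, 18], [22, 8, 18], [22, 16, 12]]

print(satisfies_44(PSI_3, LAMBDA_T_3, [0, 1, 0], [1, 0, 0]))  # Red on route 2, Blue on route 1
print(satisfies_44(PSI_3, LAMBDA_T_3, [1, 0, 0], [1, 0, 0]))  # both on route 1
```

A check of this kind is useful for confirming the output of a non-linear solver started from different values of p and q, since different starting points may return different equilibria.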


In a perfect world, Blue and Red will agree to use a pure equilibrium, and they
will never meet on a route. Both will choose the routes that provide them the greatest
indirect detection time without being on the route together. This is unlikely, as each will
be inclined to use this predictability to his advantage and actively intercept the other. A
better way to approach this problem is to determine the value of the individual games
(search or ambush) and use these as our starting points for determining the mixed
equilibrium strategies (i.e., p and q in (44)). We can determine the expected payoffs of the
individual games in the following way:

$$V_{Search} = x^{*T}_{Red\text{-}Search}\, \Psi\, y^*_{Blue\text{-}Search}, \qquad V_{Ambush} = x^{*T}_{Red\text{-}Ambush}\, \Lambda^T\, y^*_{Blue\text{-}Ambush} \tag{45}$$


These  two  values,  while  interesting,  do  not  take  into  account  the  dynamics  of  both 
players  being  threatened  while  simultaneously  threatening  their  opponent.  What  they  do 
provide is a starting point when we apply non-linear programming to determine both
players’  optimal  strategies  when  faced  with  their  competing  self-interests  of  survival  and 
attack.  Depending  on  the  risk  either  faces  from  moving,  we  may  still  encounter  pure 
strategies  where  Red  or  Blue  decide  to  use  a  single  route  rather  than  run  the  risk  of 
changing  routes  even  if  they  can  shorten  the  expected  survival  time  of  the  other.  Clearly, 
if  the  risk  of  movement  is  not  too  great,  a  mixed  strategy  would  best  benefit  both  players 
as  they  can  randomize  their  route  selection.  In  the  examples  that  follow,  we  will  explore 
several  variations  of  this  game. 

B.  EXAMPLES 

1.  High  Risk  of  Movement  for  Red  and  Uniform  $\alpha$

In this example, we will examine the game where Blue may choose from six
routes. The situation is such that Red faces a significant risk every time he decides to
move: less than a 50% probability of successfully completing the move. Blue has
relative freedom of movement: greater than a 90% probability of successfully changing
routes. The rate of increase in the probability of detection is uniform for all players
($\alpha = \beta = \gamma = .01$).


We  list  Red  and  Blue’s  initial  probabilities  of  detection  in  the  following  table, 
along  with  the  variables  assigned  to  them  in  Chapters  II  and  III. 


Route   p (Search Active)   q (Search Passive)   s (Ambush Active)   r (Ambush Passive)
1       0.5                 0.3                  0.4                 0.1
2       0.7                 0.4                  0.6                 0.1
3       0.4                 0.1                  0.3                 0.2
4       0.3                 0.2                  0.4                 0.1
5       0.2                 0.1                  0.5                 0.2
6       0.5                 0.1                  0.3                 0.1

Table 1. Example I: Probabilities of Ambush / Detection


Our first step is to determine the expected time to capture / ambush assuming that
neither Red nor Blue is allowed to change routes (m = 0). Doing so using the Search
Model gives us

m=0       1      2      3      4      5      6
A_i^0     3.05   2.37   6.56   4.21   6.56   6.56
B_i^0     1.93   1.40   2.37   3.05   4.21   1.93

Table 2. Example I: Red's Expected Survival Time with No Moves


Using this, we can then start to determine the value of the routes and the time Red
will stay on each route. To compute how much extra time Red expects to gain from
moving from route i to route j, we need to define the risk he faces during the move.
Using (26), (27), (41), (42) and Red and Blue's probabilities of successfully changing
routes, given respectively by the matrices P and $\Theta$ below, we determine that by letting m = 4
we reach stability, in that the values for A, B, T, D, and V converge to within the first two
decimal places.

P     1       2       3       4       5       6
1     0       0.452   0.409   0.452   0.435   0.402
2     0.452   0       0.452   0.435   0.452   0.435
3     0.409   0.452   0       0.402   0.435   0.452
4     0.452   0.435   0.402   0       0.452   0.409
5     0.435   0.452   0.435   0.452   0       0.452
6     0.402   0.435   0.452   0.409   0.452   0

Table 3. Example I: Red's Probabilities for a Successful Move


Θ     1      2      3      4      5      6
1     0      0.98   0.96   0.94   0.93   0.92
2     0.98   0      0.98   0.96   0.94   0.93
3     0.96   0.98   0      0.98   0.96   0.94
4     0.94   0.96   0.98   0      0.98   0.96
5     0.93   0.94   0.96   0.98   0      0.98
6     0.92   0.93   0.94   0.96   0.98   0

Table 4. Example I: Blue's Probabilities for a Successful Move


The  values  of  A,  T,  B,  D,  and  V  all  correspond  to  the  equations  given  in  Chapter  II. 


m=4       1       2      3       4       5       6
A_i^4     3.05    2.45   6.56    4.21    6.56    6.56
T_i^4     13.83   1.04   34.97   23.08   40.76   33.76
B_i^4     2.28    2.44   2.38    3.05    4.21    2.29
D_i^4     0       0      4.97    13.08   30.76   0
V_i^4     2.28    2.44   2.22    2.32    1.97    2.29

Table 5. Example I: Red's Expected Survival Time with Multiple Moves


For this example, the optimal strategies for Red and Blue (in the absence of the
Ambush model) are

                     1    2    3        4    5        6
x*_Red-Search        0    0    0.2661   0    0.4734   0.2605
y*_Blue-Search       0    0    0.2661   0    0.4734   0.2605

Table 6. Example I: Optimal Search Strategies

Red can expect the following survival time using this strategy:
$V_{Search} = x^{*T}_{Red\text{-}Search}\, \Psi\, y^*_{Blue\text{-}Search} = 5.446$. Note that Red and Blue share the same route
selection strategies in this example. This will not always be the case.

We  need  to  now  find  the  expected  survival  times  for  Blue  using  the  Ambush 
model.  As  with  the  Search  model,  we  start  by  determining  the  initial  expected  survival 
times  along  each  route  assuming  that  Blue  is  not  allowed  to  change  routes  once  one  is 
selected  (m  =  0). 


m=0       1      2      3      4      5      6
C_i^0     3.13   3.13   2.55   4.13   2.55   3.13
E_i^0     1.82   1.39   2.13   1.82   1.58   2.13

Table 7. Example I: Blue's Expected Survival Time with No Moves

Since Blue faces less risk moving from cell to cell, the expected times of survival
converge much more slowly than for Red. By m = 35 we get convergence in the first two
decimal places.

Table  8  shows  the  values  of  C,  M,  E,  N,  and  V  as  given  in  Chapter  III. 


m=35       1      2      3      4      5      6
C_i^35     4.78   4.80   4.57   4.79   4.53   4.77
M_i^35     1.25   1.24   0.19   1.24   0.21   1.25
E_i^35     4.45   4.48   4.56   4.47   4.52   4.44
N_i^35     0      0      0      0      0      0
V_i^35     4.45   4.48   4.56   4.47   4.52   4.44

Table 8. Example I: Blue's Expected Survival Time with Multiple Moves

It should not surprise us that all $N_i$ are zero. With so little risk to Blue's
movements, he will change routes immediately if he finds himself on the same route as
Red. Keep in mind that in this game Blue is playing the rows, and Red is playing the
columns, of our matrix $\Lambda$; however, for the sake of consistency, we will continue to define
Blue's optimal strategy as $y^*$ and Red's as $x^*$. The optimal mixed strategy for each player
in the absence of the Search model thus becomes


                     1        2        3    4        5    6
x*_Red-Ambush        0.2519   0.2519   0    0.2519   0    0.2443
y*_Blue-Ambush       0.2125   0.3063   0    0.2750   0    0.2062

Table 9. Example I: Optimal Ambush Strategies

The value of the Ambush model is $V_{Ambush} = x^{*T}_{Red\text{-}Ambush}\, \Lambda^T\, y^*_{Blue\text{-}Ambush} = 4.702$,
and this also gives us the time Blue can expect to survive without an ambush using the
available routes. We note here that $V_{Ambush} < V_{Search}$ and, given these results alone, we
expect an ambush to occur before Red is discovered. In this event, Blue may want to find
additional routes or find a way to increase the rate of indirect detection in those cells still
relevant for Red to use.

Using the Search Model, we can vary the values for $\alpha$ and $\beta$ to see the effect of
each. The graphs below show the change in $V_{Search}$ as we vary these parameters while
keeping the other one constant.

Figure 1. $\alpha$ varies while $\beta$ = .01

Figure 2. $\beta$ varies while $\alpha$ = .01 (Same Graph - Different scales for horizontal axis)

Clearly, Blue's indirect efforts to detect Red ($\alpha$) reduce Red's expected survival
time ($V_{Search}$) with greater efficiency than Blue's direct efforts ($\beta$). As expected, as both
direct and indirect efforts increase, they show a decreasing return on their ability to reduce
Red's expected survival time. If Blue finds himself at a disadvantage (i.e., his expected
survival time is less than Red's), this analysis can help the commander decide where to
place his efforts (direct or indirect) to get the best reduction in Red's expected survival
time.


Taking both games into consideration, we obtain the following bimatrix
consisting of $\Psi$ and $\Lambda^T$; each cell contains a value from each individual payoff matrix,
$(\psi_{ij}, \lambda_{ji})$. Rows are Red's routes and columns are Blue's routes.

      1               2               3               4               5               6
1     (2.28, 4.45)    (3.05, 4.80)    (3.05, 4.57)    (3.05, 4.80)    (3.05, 4.53)    (3.05, 4.77)
2     (2.45, 4.78)    (2.44, 4.48)    (2.45, 4.57)    (2.45, 4.80)    (2.45, 4.53)    (2.45, 4.77)
3     (6.56, 4.78)    (6.56, 4.80)*   (2.38, 4.56)    (6.56, 4.80)*   (6.56, 4.53)    (6.56, 4.77)
4     (4.21, 4.78)    (4.21, 4.80)    (4.21, 4.57)    (3.05, 4.48)    (4.21, 4.53)    (4.21, 4.77)
5     (6.56, 4.78)    (6.56, 4.80)*   (6.56, 4.57)    (6.56, 4.80)*   (4.21, 4.52)    (6.56, 4.77)
6     (6.56, 4.78)    (6.56, 4.80)*   (6.56, 4.57)    (6.56, 4.80)*   (6.56, 4.53)    (2.29, 4.44)

Table 10. Example I: Bimatrix Model

We immediately note that there are multiple pure Nash equilibrium points in the
above matrix. The points that yield the maximum survival time for both players are marked
with an asterisk above. If Blue and Red were purely rational players, they would avoid each other
completely. Blue would choose either route 2 or 4, and Red would choose 3, 5 or 6. Blue
would then obtain a value of 4.80 and Red a value of 6.56 for the bimatrix game. Clearly
Red has the advantage in this scenario. To find the mixed Nash equilibrium points we
resort to non-linear optimization.

Why do we need to find mixed equilibria if we already have several pure
strategies from which to choose? If Red and Blue collaborate to ensure they both avoid
capture/ambush for the longest possible time, then pure strategies are the answer.
However, both parties can be less concerned about their own survival time and more
concerned with reducing the survival time of their adversary. We call these
strategies risk averse (wishing to maximize one's own survival time) and risk prone
(disregarding self-preservation in an effort to reduce the other's). By adjusting the initial
point in our non-linear programming, we arrive at different optimal mixed strategies. In
the tables below, we declare our starting point as (p, q), where p is Red's payoff from the
Search game and q is Blue's payoff from the Ambush game; refer to equation (44).

We assume that each player wishes to find the optimal strategy that gets him as
close to the value of his individual game as possible. This means we conduct our
non-linear optimization from the initial value of (5.446, 4.702). Using a software package (in
this case Maple's NLPSolve command) we obtain the following mixed strategies:

(5.446, 4.702)    1    2    3       4    5       6
x*_Red            0    0    0.283   0    0.717   0
y*_Blue           0    1    0       0    0       0

Table 11. Example I: Optimal Bimatrix, Risk Averse Strategies


Our players are now using strategies that are consistent with the pure strategies
previously noted (and only Red is using a mixed strategy). Note that they still do
not choose routes that intersect with their adversary. Using these mixed strategies,
we can determine the value of the game for each player as
$V_{Red} = x^{*T} \Psi y^* = 6.56$ and $V_{Blue} = x^{*T} \Lambda^T y^* = 4.80$. This should not be surprising, as the
pure strategies all lead to the same values of the game for each player. In practical
applications, any mixed strategy involving the pure strategies will yield the same result.
Over the course of time, each player benefits from randomizing his choices, so a mixed
strategy is preferable to a pure strategy, especially when they lead to the same values
for the game.


What  if  each  player  was  so  aggressive  that  he  was  not  concerned  with  maximizing 
his  own  survival  time?  By  setting  our  initial  point  to  (0,0)  we  get  the  following 
strategies: 


(0, 0)     1    2       3       4       5       6
x*_Red     0    0       0.283   0       0.717   0
y*_Blue    0    0.643   0       0.357   0       0

Table 12. Example I: Optimal Bimatrix, Risk Prone Strategies


Blue has now adopted a mixed strategy and Red's strategy has not changed. The
values of the games are unchanged at $V_{Red} = x^{*T} \Psi y^* = 6.56$ and $V_{Blue} = x^{*T} \Lambda^T y^* = 4.80$.
Clearly there are multiple strategies for Red and Blue that lead to the same values. In
reality, a route that is conducive to a successful ambush (choke points with lots of cover)
is also conducive to hiding. Likewise, a route that is favorable for a convoy (wide open
spaces) is not favorable for the enemy seeking to avoid detection. Therefore, it should
not be surprising that Red and Blue seek out different routes.
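The claim that the risk averse and risk prone strategies give the same game values can be checked directly from the bimatrix. The sketch below is an illustration only: the helper names are ours, and the vectors A, B, C and E are read off the cells of Table 10 (off-diagonal and diagonal entries), so they inherit any rounding in that table.

```python
def build_bimatrix(a, b, c, e):
    """Build Psi and Lambda^T with Red on rows and Blue on columns.

    Psi[i][j] is A_i off the diagonal and B_i on it; Lambda^T[i][j] is C_j off the
    diagonal and E_j on it, following the payoff definitions of the bimatrix model.
    """
    n = len(a)
    psi = [[b[i] if i == j else a[i] for j in range(n)] for i in range(n)]
    lam_t = [[e[j] if i == j else c[j] for j in range(n)] for i in range(n)]
    return psi, lam_t


def game_value(x, matrix, y):
    """Expected payoff x^T * matrix * y for plain Python lists."""
    return sum(x[i] * matrix[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


A = [3.05, 2.45, 6.56, 4.21, 6.56, 6.56]   # Red, Blue on another route
B = [2.28, 2.44, 2.38, 3.05, 4.21, 2.29]   # Red, Blue on the same route
C = [4.78, 4.80, 4.57, 4.80, 4.53, 4.77]   # Blue, Red on another route
E = [4.45, 4.48, 4.56, 4.48, 4.52, 4.44]   # Blue, Red on the same route
PSI, LAMBDA_T = build_bimatrix(A, B, C, E)

X_RED = [0, 0, 0.283, 0, 0.717, 0]
Y_RISK_AVERSE = [0, 1, 0, 0, 0, 0]
Y_RISK_PRONE = [0, 0.643, 0, 0.357, 0, 0]

for label, y in (("risk averse", Y_RISK_AVERSE), ("risk prone", Y_RISK_PRONE)):
    print(label, round(game_value(X_RED, PSI, y), 2), round(game_value(X_RED, LAMBDA_T, y), 2))
```

Both strategy pairs put weight only on cells where Red receives 6.56 and Blue receives 4.80, which is why the two values do not change when Blue spreads his probability over routes 2 and 4.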

Blue can use this information to his advantage by choosing his route strategy
from among the mixed strategies over routes 2 and 4 that lead to $V_{Blue} = 4.80$, while avoiding
routes 3 and 5. Blue also has the benefit of learning the mixed strategy Red will adopt in
the Nash equilibrium. It is important to note that if Blue were to use this information to
send a separate patrol to find Red using this strategy, it would violate the assumptions set
forth at the beginning of this paper. The game would then become one of three players rather
than two. Next, we will see what happens when Red is allowed to move with less risk.

2.  Low  Risk  of  Movement  for  Red  and  Uniform  $\alpha$


In this example, we will set the probabilities of detection/hazard for Red and Blue
(indirect and direct) very close to each other. This will drive them to consider the same
routes. We will also make it less likely that Red will be intercepted in transit. As in
Example I with Blue, both players will have a 90% or greater probability of successfully
changing routes. The linear rate of increase in the probability of detection/hazard is
uniform for all players ($\alpha = \beta = \gamma = .01$).

Red  and  Blue’s  initial  probabilities  of  detection  are  given  in  the  following  table 
along  with  the  variables  assigned  to  them  in  Chapters  II  and  III. 


Route   p (Search Active)   q (Search Passive)   j (Ambush Active)   r (Ambush Passive)
1       0.5                 0.3                  0.4                 0.2
2       0.5                 0.4                  0.6                 0.1
3       0.4                 0.2                  0.4                 0.2
4       0.3                 0.2                  0.3                 0.1
5       0.3                 0.1                  0.2                 0.1
6       0.2                 0.1                  0.3                 0.1

Table 13.  Example II: Probabilities of Ambush / Detection


As before, our first step is to determine the expected time to capture / ambush assuming that neither Red nor Blue is allowed to change routes (m = 0). Doing so using the Search Model gives us


m = 0   1       2       3       4       5       6
A^0     3.05    2.37    63.56   4.21    6.56    6.56
B^0     1.93    1.40    2.37    3.05    4.21    1.93

Table 14.  Example II: Red's Expected Survival Time with No Moves


Once again, we go through the process outlined in Example 1 to determine the final values for each route. Both Blue and Red face little risk in moving, as shown in Table 15. Our equations (26), (27), (41) and (42) converge much more slowly, and we must calculate larger values of m before reaching a stable solution. As with Blue in the first example, we will need m = 35 to get convergence to the first two decimal places.

P and O   1       2       3       4       5       6
1         0       0.98    0.96    0.94    0.93    0.92
2         0.98    0       0.98    0.96    0.94    0.93
3         0.96    0.98    0       0.98    0.96    0.94
4         0.94    0.96    0.98    0       0.98    0.96
5         0.93    0.94    0.96    0.98    0       0.98
6         0.92    0.93    0.94    0.96    0.98    0

Table 15.  Example II: Red and Blue's Probabilities for a Successful Move


The  values  of  A,  T,  B,  D,  and  V  are  found  with  the  equations  given  in  Chapter  II. 


m = 35   1       2       3       4       5       6
A^35     6.86    6.96    7.49    7.13    7.55    7.52
T^35     0       0       4.51    0       4.29    4.34
B^35     6.86    6.96    6.89    7.13    7.00    6.95
D^35     0       0       0       0       0       0
V^35     6.86    6.96    6.89    7.13    7.00    6.95

Table 16.  Example II: Red's Expected Survival Time with Multiple Moves


For  this  example,  the  optimal  strategies  for  Red  and  Blue  (in  the  absence  of  the 
Ambush  model)  are 


                  1       2       3        4       5        6
x*_Red-Search     0       0       0.3206   0       0.3459   0.3335
y*_Blue-Search    0       0       0.2673   0       0.3992   0.3335

Table 17.  Example II: Optimal Search Strategies

Red can expect the following survival time using this strategy:

$V_{Search} = x^{*T}_{Red-Search} \Psi y^{*}_{Blue-Search} = 7.330$
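
As a check, the value of the Search game can be recomputed from the rounded entries of Table 16 and the strategies of Table 17. The sketch below assumes the payoff matrix has the same form as Red's entries in the bimatrix of Table 21: A^35 for Red's route when the players are on different routes and V^35 when they share a route. The function and variable names are illustrative and do not come from the thesis.

```python
# Rounded values from Table 16 (m = 35), indexed by route 1..6.
A35 = [6.86, 6.96, 7.49, 7.13, 7.55, 7.52]   # Red and Blue on different routes
V35 = [6.86, 6.96, 6.89, 7.13, 7.00, 6.95]   # Red and Blue on the same route

# Mixed strategies from Table 17.
x_red_search = [0, 0, 0.3206, 0, 0.3459, 0.3335]
y_blue_search = [0, 0, 0.2673, 0, 0.3992, 0.3335]


def payoff_matrix(off_diag, diag):
    """Entry [i][j] is diag[i] when both pick route i, else off_diag[i]."""
    n = len(off_diag)
    return [[diag[i] if i == j else off_diag[i] for j in range(n)]
            for i in range(n)]


def bilinear_value(x, matrix, y):
    """Expected payoff x^T M y for row strategy x and column strategy y."""
    return sum(x[i] * matrix[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


psi_search = payoff_matrix(A35, V35)
v_search = bilinear_value(x_red_search, psi_search, y_blue_search)
print(f"V_Search = {v_search:.3f}")
```

With the two-decimal table entries, the result agrees with the reported value of 7.330 to within rounding.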
Doing the same for Blue, we obtain the following.


m = 0   1       2       3       4       5       6
C^0     4.21    6.56    4.21    6.56    6.56    6.56
E^0     2.37    1.62    2.37    3.05    4.21    3.05

Table 18.  Example II: Blue's Expected Survival Time with No Moves

Table  19  shows  the  values  of  C,  M,  E,  N,  and  V  as  given  in  Chapter  III. 


m = 35   1       2       3       4       5       6
C^35     7.24    7.70    7.43    7.80    7.83    7.80
M^35     0       3.82    0       3.54    3.45    3.54
E^35     7.24    7.24    7.43    7.39    7.44    7.38
N^35     0       0       0       0       0       0
V^35     7.24    7.24    7.43    7.39    7.44    7.38

Table 19.  Example II: Blue's Expected Survival Time with Multiple Moves

As with Blue in Example 1, all D and N are zero, since Red and Blue face little risk in moving if they find themselves in the same cell as their adversary. We will continue to define Blue's optimal strategy as y* and Red's as x*. The optimal mixed strategy for each player considering only the Ambush model thus becomes


                  1       2        3       4        5        6
x*_Red-Ambush     0       0.0426   0       0.2862   0.3872   0.2841
y*_Blue-Ambush    0       0.2253   0       0.2545   0.2659   0.2543

Table 20.  Example II: Optimal Ambush Strategies

The value of the Ambush model is $V_{Ambush} = y^{*T}_{Blue-Ambush} \Lambda x^{*}_{Red-Ambush} = 7.6776$, and this also gives us the time Blue can expect to survive without an ambush using the available routes. Unlike our previous example, $V_{Ambush} > V_{Search}$, leading us to believe that Blue has a slight advantage in this scenario and can expect to live longer.
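
One way to sanity-check the Ambush strategies is the indifference property of a mixed equilibrium: every route Blue uses with positive probability should earn roughly the same expected payoff against Red's mix, and unused routes should earn no more. The sketch below assumes Blue's payoff on a route is C^35 when Red is elsewhere and V^35 when Red is on the same route, using the rounded entries of Tables 19 and 20. All names are illustrative.

```python
# Rounded values from Table 19 (m = 35), indexed by route 1..6.
C35 = [7.24, 7.70, 7.43, 7.80, 7.83, 7.80]       # Red on a different route
V35_blue = [7.24, 7.24, 7.43, 7.39, 7.44, 7.38]  # Red on the same route

# Mixed strategies from Table 20.
x_red_ambush = [0, 0.0426, 0, 0.2862, 0.3872, 0.2841]
y_blue_ambush = [0, 0.2253, 0, 0.2545, 0.2659, 0.2543]

# Expected payoff to Blue of committing to route j against Red's mix:
# Red is on route j with probability x_j and elsewhere with probability 1 - x_j.
route_payoff = [C35[j] * (1 - x_red_ambush[j]) + V35_blue[j] * x_red_ambush[j]
                for j in range(6)]

# Routes that Blue actually plays under y*.
supported = [j for j in range(6) if y_blue_ambush[j] > 0]

# Value of the game to Blue: payoff of each route weighted by how often he uses it.
v_ambush = sum(y_blue_ambush[j] * route_payoff[j] for j in supported)

for j in range(6):
    tag = "used" if j in supported else "unused"
    print(f"route {j + 1}: {route_payoff[j]:.3f} ({tag})")
print(f"V_Ambush ~ {v_ambush:.3f}")
```

Because the table entries are rounded to two decimals, the payoffs of the used routes (2, 4, 5 and 6) match each other and the reported 7.6776 only to within a few thousandths; the unused routes 1 and 3 fall clearly below.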
As before, we obtain the following bi-matrix consisting of $\Psi$ and $\Lambda$; each cell contains a value from each individual payoff matrix $(\psi, \lambda)$.


Red \ Blue   1               2               3               4               5               6
1            (6.86, 7.24)    (6.86, 7.70)    (6.86, 7.43)    (6.86, 7.79)    (6.86, 7.83)    (6.86, 7.79)
2            (6.96, 7.24)    (6.96, 7.24)    (6.96, 7.43)    (6.96, 7.79)    (6.96, 7.83)    (6.96, 7.79)
3            (7.49, 7.24)    (7.49, 7.70)    (6.89, 7.43)    (7.49, 7.79)    (7.49, 7.83)    (7.49, 7.79)
4            (7.13, 7.24)    (7.13, 7.70)    (7.13, 7.43)    (7.13, 7.39)    (7.13, 7.83)    (7.13, 7.79)
5            (7.55, 7.24)    (7.55, 7.70)    (7.55, 7.43)    (7.55, 7.79)*   (7.00, 7.44)    (7.55, 7.79)*
6            (7.52, 7.24)    (7.52, 7.70)    (7.52, 7.43)    (7.52, 7.79)    (7.52, 7.83)*   (6.95, 7.38)

Table 21.  Example II: Bimatrix Model


The pure Nash Equilibria are marked with an asterisk in Table 21. Interestingly, both Red and Blue would prefer to use Route 5, as it provides them the greatest value if their adversary is not along that route. To avoid confrontation, which would decrease their survival time, Red and Blue would benefit from using the route pairs (5,4), (6,5), or (5,6). As mentioned previously, a pure strategy is not viable over time, as it gives the adversary a clearer picture of where to find you. We see through our non-linear programming that a mixed strategy provides both Red and Blue with greater values for their individual games, though at the risk that they might take the same route at the same time.
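
The pure equilibria can be recovered mechanically from the bimatrix by testing every cell for a profitable unilateral deviation. The following is a minimal sketch that rebuilds the two payoff matrices from the values as printed in Table 21 (rows are Red's route, columns are Blue's route); the helper name `pure_nash` and the list names are illustrative assumptions.

```python
# Values as they appear in Table 21, indexed by route 1..6.
red_off = [6.86, 6.96, 7.49, 7.13, 7.55, 7.52]    # Red's value, different routes
red_same = [6.86, 6.96, 6.89, 7.13, 7.00, 6.95]   # Red's value, same route
blue_off = [7.24, 7.70, 7.43, 7.79, 7.83, 7.79]   # Blue's value, different routes
blue_same = [7.24, 7.24, 7.43, 7.39, 7.44, 7.38]  # Blue's value, same route

N = 6
# psi[i][j] is Red's payoff and lam[i][j] is Blue's payoff
# when Red takes route i + 1 and Blue takes route j + 1.
psi = [[red_same[i] if i == j else red_off[i] for j in range(N)]
       for i in range(N)]
lam = [[blue_same[j] if i == j else blue_off[j] for j in range(N)]
       for i in range(N)]


def pure_nash(psi, lam, tol=1e-9):
    """Return 1-based (Red route, Blue route) pairs with no profitable deviation."""
    n = len(psi)
    found = []
    for i in range(n):
        for j in range(n):
            red_best = max(psi[k][j] for k in range(n))   # Red's best reply to j
            blue_best = max(lam[i][k] for k in range(n))  # Blue's best reply to i
            if psi[i][j] >= red_best - tol and lam[i][j] >= blue_best - tol:
                found.append((i + 1, j + 1))
    return found


print(pure_nash(psi, lam))  # [(5, 4), (5, 6), (6, 5)]
```

The three pairs returned are the route pairs (5,4), (5,6) and (6,5) discussed in the text.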


Using non-linear optimization, we will start our search for mixed Nash Equilibria with the values of the individual games, $V_{Search} = 7.330$ and $V_{Ambush} = 7.6776$. This produces the following optimal strategies:


(7.330, 7.678)   1       2       3        4        5        6
x*_Red           0       0       0.8974   0        0.1026   0
y*_Blue          0       0       0        0.8383   0.1091   0.0526

Table 22.  Example II: Optimal Bimatrix, Risk-Averse Strategies

Using these mixed strategies, Red and Blue obtain the following values for the game: $V_{Red} = x^{*T} \Psi y^{*} = 7.49$ and $V_{Blue} = x^{*T} \Lambda y^{*} = 7.79$. Note how their mixed strategies give them a greater payoff than the pure strategies. This comes from an interesting choice for how they randomly choose the route they will take.
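
The two game values, and the chance that the players end up on the same route, can be reproduced from the printed tables. The sketch below assumes the payoff matrices are built from the Table 21 entries (rows are Red's route, columns are Blue's route) and uses the strategies of Table 22; the names are illustrative.

```python
# Values as they appear in Table 21, indexed by route 1..6.
red_off = [6.86, 6.96, 7.49, 7.13, 7.55, 7.52]    # Red's value, different routes
red_same = [6.86, 6.96, 6.89, 7.13, 7.00, 6.95]   # Red's value, same route
blue_off = [7.24, 7.70, 7.43, 7.79, 7.83, 7.79]   # Blue's value, different routes
blue_same = [7.24, 7.24, 7.43, 7.39, 7.44, 7.38]  # Blue's value, same route

# Mixed strategies from Table 22.
x_red = [0, 0, 0.8974, 0, 0.1026, 0]
y_blue = [0, 0, 0, 0.8383, 0.1091, 0.0526]

N = 6
psi = [[red_same[i] if i == j else red_off[i] for j in range(N)]
       for i in range(N)]
lam = [[blue_same[j] if i == j else blue_off[j] for j in range(N)]
       for i in range(N)]


def expected_payoff(x, matrix, y):
    """Bilinear form x^T M y: expected payoff under independent mixing."""
    return sum(x[i] * matrix[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


v_red = expected_payoff(x_red, psi, y_blue)
v_blue = expected_payoff(x_red, lam, y_blue)

# Probability that both players independently draw the same route.
p_same_route = sum(xi * yi for xi, yi in zip(x_red, y_blue))

print(f"V_Red = {v_red:.2f}, V_Blue = {v_blue:.2f}, "
      f"P(same route) = {p_same_route:.4f}")
```

Under these assumptions the values round to the reported 7.49 and 7.79. The only route both players use is Route 5, so the probability of meeting is the product of their Route 5 weights, a little over 1%.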

From  Table  21,  Red’s  route  preference  (from  highest  value  to  lowest)  should  be 
5-6-3-4-2-1.  Blue’s  preference  should  be  5-4-6-2-3-1,  where  4  and  6  could  be
interchanged,  since  they  have  the  same  value.  Interestingly,  the  mixed  strategies  in  Table 
22 show that Red will avoid Route 6, even though it is his second-highest-valued route, to avoid Blue. The values for Route 5 are great enough that both Red and Blue will risk taking Route 5 approximately 10% of the time, which raises their overall values above those of the pure strategies.


As  in  Example  1,  we  set  our  initial  value  at  (0,0)  to  represent  a  more  aggressive 
game  where  each  player  wishes  to  obtain  a  strategy  that  brings  their  opponent’s  values  to 
zero.  Doing  so  produces  the  following  optimal  strategies: 


(0, 0)    1       2       3        4        5        6
x*_Red    0       0       0.8974   0        0.1026   0
y*_Blue   0       0       0        0.8383   0.1091   0.0526

Table 23.  Example II: Optimal Bimatrix, Risk-Prone Strategies


Clearly, there is no change in optimal strategies from our previous starting point and, therefore, the values of the game for Red and Blue are unchanged. In fact, by varying our initial point we can see that this mixed strategy is relatively stable. Each player can be assured of the outcome regardless of the aggressiveness of their adversary. As before, Blue can then use this information to route his convoy using y*, knowing on which routes he is most likely to encounter Red.


V.  TOPICS FOR FURTHER RESEARCH


A.  APPLICATION 

Practical application of this method relies on determining the initial probabilities of indirect and direct detection for each route under consideration. While this may be impossible to do with any certainty, approximations based on the relative risk each route presents can provide a starting point. This model also offers the ability to determine the effect, if any, that indirect detection methods (rewards, humanitarian efforts, etc.) have on the enemy's strategy for route selection when trying to engage a convoy. Also, by assigning a cost to both direct and indirect measures, the commander can determine which investment returns the greatest survival times for his convoy.

B.  POSSIBLE  FOLLOW-ON  RESEARCH 

This model provides only a starting point in exploring the relationships between the indirect and direct aggressiveness each player exhibits in trying to minimize their opponent's survival time while maximizing their own. This model could easily be adapted for cities where Blue wishes to dissuade enemy activity through both direct and indirect means. In this scenario, an optimal control problem is clearly present: what balance of direct and indirect aggressiveness minimizes the time Red stays in the city while allowing Blue to maximize resources? Another avenue of research is the relationship between Blue's aggressiveness ($\alpha$, $\beta$) and Red's aggressiveness ($\gamma$). This can be applied to the classic problem faced by law enforcement. As each player gets increasingly aggressive in their direct attempts to eliminate their opponent, they are met with increasing direct aggressiveness from their opponent. An indirectly aggressive approach may be more appropriate, and this model provides for exploring that option. Finally, another possible research path is to apply this model to a network of routes, where flow analysis can be combined with the expected survival times along each route to determine the best overall route to take. In conclusion, this model can be readily adapted to a myriad of problems where two parties have conflicting goals and two methods of achieving those goals.